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R.L.Karandikar, T.Krishnan, and M.R.L.N.Panchanana 


Monte Carlo methods (Fishman, 1996; Gentle, 1998; Robert and Casella, 2004; 
Gamerman and Lopes, 2006) are used to estimate functional of a distribution function 
using the generated random samples. SYSTAT provides Random Sampling, IID MC, 
and MCMC algorithms to generate random samples from the required target 
distribution. 

Random Sampling in SYSTAT enables the user to draw a number of samples, each 
of a given size, from a distribution chosen from a list of 42 distributions (discrete and 
continuous, univariate and multivariate) with given parameters. 

To simulate more complex models SYSTAT also provides random sampling from 
univariate finite mixtures. 

If no method is known for direct generation of random samples from a given 
distribution or when the density is not completely specified, then IID Monte Carlo 
methods may often be suitable. The IID Monte Carlo algorithms in SYSTAT are 
usable only to generate random samples from univariate continuous distributions. IID 
Monte Carlo consists of two generic algorithms: Rejection Sampling and Adaptive 
Rejection Sampling (ARS). In these methods an envelope (proposal) function for the 
target density is used. The proposal density is such that it is feasible to draw a random 
sample from it. In Rejection Sampling, the proposal distribution can be selected from 
SYSTAT’s list of 20 univariate continuous distributions. In ARS, the algorithm itself 
constructs an envelope (proposal) function. The ARS algorithm is applicable only for 
log-concave target densities. 

A Markov chain Monte Carlo (MCMC) method is used when it is possible to 
generate an ergodic Markov chain whose stationary distribution is the required target 
distribution. SYSTAT provides two classes of MCMC algorithms: Metropolis— 
Hastings (M-H) algorithm and the Gibbs sampling algorithm. With the M-H 
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algorithm, random samples can be generated from univariate distributions. Three types 
of the Metropolis-Hastings algorithm are available in SYSTAT: Random Walk 
Metropolis-Hastings algorithm (RWM-H), Independent Metropolis-Hastings 
algorithm (IndM-H), and a hybrid Metropolis-Hastings algorithm of the two. The 
choice of the proposal distribution in the Metropolis-Hastings algorithms is restricted 
to SYSTAT’s list of 20 univariate continuous distributions. The Gibbs Sampling 
method provided is limited to the situation where full conditional univariate 
distributions are defined from SYSTAT’s library of univariate distributions. It is 
advisable for the user to provide a suitable initial value/distribution for the MCMC 
algorithms. No convergence diagnostics are provided and it is up to the user to suggest 
the burn-in period and gap in the MCMC algorithms. 

From the generated random samples, estimates of means of user-given functions of 
the random variable under study can be computed along with their variance estimates, 
relying on the law of large numbers. A Monte Carlo Integration method can be used in 
evaluating the expectation of a functional form. SYSTAT provides two Monte Carlo 
Integration methods: Classical Monte Carlo integration and Importance Sampling 
procedures. 

HD MC and MCMC algorithms of SYSTAT generate random samples from 
positive functions only. Samples generated by the Random Sampling, IID MC and 
MCMC algorithms can be saved. 

The user has a large role to play in the use of the IID MC and MCMC features of 


SYSTAT and the success of the computations will depend largely on the user's 
judicious inputs. 


Statistical Background 


Drawing random samples from a given probability distribution is an important 
component of any statistical Monte Carlo simulation exercise. This is usually followed 
by statistical computations from the drawn samples, which can be described as Monte 
Carlo integration. The random samples drawn can be used for the desired Monte Carlo 
integration computations using SYSTAT. SYSTAT provides direct random sampling 
facilities from a list of 42 univariate and multivariate discrete and continuous 
distributions. Indeed, in statistical practice, one has to draw random samples from 
several other distributions, some of which are difficult to draw direct] y from. The 


generic IID Monte Carlo and Markov chain Monte Carlo algorithms that are provided 


by SYSTAT will be of help in these contexts. The random sampling facility from the 
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standard distributions is a significant resource, which can be used effectively in these 
generic IID and Markov chain Monte Carlo procedures. 


Random Sampling 


The random sampling procedure can be used to generate random samples from the 
distributions that are most commonly used for statistical work. SYSTAT implements, 
as far as possible, the most efficient algorithms for generating samples from a given 
type of distribution. All these depend on generation of uniform random numbers, based 
on the Mersenne-Twister algorithm and Wichmann-Hill algorithm. 


= Mersenne-Twister (MT) is a pseudorandom number generator developed by 
Makoto Matsumoto and Takuji Nishimura (1998). Random seed for the algorithm 
can be mentioned by using RSEED= seed, where seed is any integer from 1 to 
4294967295 for the MT algorithm and 1 to 30000 for the Wichmann-Hill 
algorithm. We recommend the MT option, especially if the random numbers to be 
generated in your Monte Carlo studies is fairly large, say more than 10,000. 


If you would like to reproduce results involving random number generation from 
earlier SYSTAT versions, with old command file or otherwise, make sure that your 
random number generation option (under Edit => Options => General => Random 
Number Generation) is Wichmann-Hill (and, of course, that your seed is the same as 


before). 


The list of distributions SYSTAT generates from, expressions for associated functions, 
notations used and references to their properties are given in the Volume: Data: Chapter 
4: Data Transformations: Distribution Functions. Definitions of multivariate 

distributions, notations used and, references to their properties can be found later in this 


chapter. 


Rejection Sampling 


Rejection Sampling is used when direct generation of a random sample from the target 
density is difficult or when the density is specified but for a constant as fx(x), but a 
related density gy(y) is available from which it is comparatively easy to generate 
random samples from. This gy(y) is called the majorizing density function or an 
envelope and it should satisfy the condition that Mgy(x) 2 fx(x) for every x, for some 
constant 0<M<ee For the method to work, it should be easy to draw random samples 
from the distribution defined by the density function gy(.). 
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Rejection Sampling Algorithm 


Step 1. Draw y from gy(.) 
Step 2. Draw u from Uniform(0,1). 


Step 3. If u < fy(y)/Mgy(y) then accept y as the desired realization;else return to 
Step 1 


Step 4. Repeat until one y is accepted 


Repeat this algorithm to select the desired number of samples. For distributions with 
finite support and bounded density, gy(.) can always be chosen as uniform. 


A SYSTAT user can select a proposal density from a list of univariate continuous 
density functions and a positive number M < ¢ If target and proposal functions are 
probability density functions, and M (1 < M < à is the value of a majorizing constant, 
then M can be interpreted as the average number of proposal variates required to obtain 
one sample unit from the target function. The probability of accepting a chosen 
proposal variate is large if the value of M is small. The requirements that must be 
satisfied in rejection sampling are: 


m the supremum of the ratio of target and proposal functions should be bounded, 
m the constant M should be an upper bound to this supremum ratio, and 


m the support of the target distribution must be a subset of the support of the proposal 
distribution. 


The success of random sample generation from a target function using the Rejection 


Sampling algorithm depends mainly on the choice of the proposal density function and 
the majorizing constant M. 


Adaptive Rejection Sampling (ARS) 


When Rejection Sampling is intended to be used to generate random samples from a 
target function, finding an appropriate proposal density function and a majorizing 
constant M may not be easy. In the Adaptive Rejection Sampling (Gilks, 1992; Gilks 
and Wild, 1992; Robert and Casella, 2004) method, the proposal density function is 
constructed as an (polygonal) envelope of the target function (on the log scale). The 
target function should be log-concave for this method to work. The envelope is updated 
whenever a proposal variate is rejected, so that the envelope moves closer to the target 
function by increasing the probability of accepting a candidate variate. The above 
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references may be consulted for details of the ARS algorithm. Since initial points on 
the abscissa are essential for the ARS algorithm, SYSTAT requires the user to specify 
two starting points to enable it to generate initial points between them for target 
distributions whose support is unbounded, left and right bounded, and bounded. For a 
target which has unbounded support, the two specified points are starting points; for a 
left (right) bounded support, the left (right) point is a bound and the other point is a 
starting point used to create initial points between the bound and itself; and for a 
bounded support, both the specified points are bounds. 


Metropolis-Hastings (M-H) Algorithm 


There are limitations in the applicability of the ARS algorithm. It is applicable only to 
generate from univariate distributions and that too only with log-concave densities. 
However, there is need to generate random samples from non-log-concave densities 
and from multivariate densities. In the multivariate case, the target distribution is often 
defined indirectly and/or incompletely, depending on the way it arises in the statistical 
problem on hand. However, so long as the target distribution is uniquely defined from 
the given specifications, it is possible to adopt an iterative random sampling (Monte 
Carlo) procedure, which at the point of convergence delivers a random draw from the 
target distribution. These iterative Monte Carlo procedures generate a random 
sequence with the Markovian property such that the Markov chain is ergodic with a 
limiting distribution coinciding with the target distribution, if the procedure is suitably 
chosen. There is a whole family of such iterative procedures collectively called MCMC 
procedures, with different procedures being suitable for different situations. The 
Metropolis-Hastings (M-H) algorithm of SYSTAT generates random samples from 
only univariate distributions, although the algorithm in general can generate from 
multivariate distributions. Thus, SYSTAT's M-H algorithm is useful when the target 
density is not necessarily log-concave. 

In MCMC algorithms, a suitable transition kernel K that satisfies the reversibility 
condition K(x,y)n(x) = (y)K(y,x) with a probability density function 7(x) is utilized 
to generate random variates from a stationary target function n(x). The Metropolis- 
Hastings algorithm (Chib and Greenberg, 1995; Gilks, Richardson, and Spiegelhalter, 
1998: Liu, 2001) is a type of MCMC algorithm that constructs a Markov chain by 
choosing a transition kernel (continuous-time analog of the transition probability 
matrix of the discrete case), 
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Kyu (x,y) = 90 Dala, y) += [aO AYE) 
Ri 


satisfying the reversibility condition q(yix)o(x,y)T(x)=q(xly)o(y,x)TX(y) and has target 
function T(.) as its stationary density function, where the acceptance probability is: 


VIVA ||, À 
a(x, y) min ari if m2) q(x|y) > 


1 if mx)¢(x|y) = 0 


The function q(.) is a proposal density function, 
l= [90109491 
K 


is the probability that the chain remains at x, and 6,(y) denotes a point mass at x. 


Since the target function n(x) appears both in the numerator and denominator of 

a(x, y), knowledge of the normalization constant for the target function is not needed. 
By selecting a suitable proposal density for the target function, a Markov chain is 
generated by the following Metropolis-Hastings algorithm. 


Metropolis-Hastings Algorithm 
Step 1. Generate y, from q(ylx) 
Step 2. Generate u from uniform(0,1) 
Step 3. If usyx,y,) then x“) = y; else x(t!) = x0 


Continue the above three steps until the required number of samples are generated. 
A sample produced by the Metropolis-Hastings algorithm may not be an independent 
identically distributed (i.i.d.) sample, because rejection of a proposed variate may 
produce a repetition of target variate x" at time (t+1). By selecting random variates 
with sufficiently large ‘gaps’ from the output, the M-H algorithm may result in 
infrequent occurrence of repeated values. There are several approaches to selecting 
proposal functions, resulting in specific types of M-H algorithms. SYSTAT provides 
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three such types: Random Walk Metropolis-Hastings algorithm, Independent 
Metropolis-Hastings algorithm, and a Hybrid Random Walk and Independent 
Metropolis-Hastings algorithm. 


Independent Metropolis-Hastings algorithm. Under a proposal distribution q(ylx) 
using an independence chain, the probability of moving a point y is independent of the 
current position x of the chain, i.e., q(ylx) = q(y). The acceptance probability can be 
written as (x,y) = min{w(y)/w(x), 1}, where w(x) = 7Xx)/q(x) can be considered a 
weight function that can be used in an important sampling process, if the generated 
variates are from q(ylx). It is suggested that the proposal be selected in such a way that 
the weight function is bounded. In that case, the generated Markov chain is uniformly 
ergodic. It must be ensured that the support of the proposal distribution covers the 
support of the target distribution. 


Random Walk Metropolis-Hastings algorithm. Under a proposal distribution q(ylx) 
using a random walk chain, the new value y equals the current value plus €,. A random 
variate €, follows the distribution q(.) and is independent of the current value, i.e., 
q(ylx) = q(y-x). The acceptance probability can be written as (x,y) = min{n(y)/n(x), 
1}, where the generated variates are from the proposal distribution. When the proposal 
density is continuous and positive around zero, the generated RWM-H chain is ergodic. 
It is recommended that a symmetric proposal distribution around zero be used. The 
performance of the algorithm depends on the scale of the proposal distribution. A 
proposal with small steps has a high acceptance rate, but the generated chain mixes 
very slowly. A proposal with large steps does not move frequently resulting in a low 
acceptance rate and slow mixing. The scale of the proposal should be chosen so that 
both of the above cases are avoided. If the range of the target function is finite, then 
boundary points can be treated as reflecting barriers of a random walk in the algorithm. 


Hybrid Random Walk and Independent Metropolis-Hastings algorithm. When there 
are two Markov chains with the same stationary distribution, a new Markov chain will 
be generated by choosing one of the chains with equal probability 0.5 at each step. 
Thus using Markov chains generated by random walk and independent Metropolis- 
Hastings algorithms by the above device, a new hybrid chain is generated. This has the 
advantage that, if for the given target function, even if one of the two chains is well- 
behaved, then the hybrid chain is well-behaved. 
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Gibbs Sampling 


Gibbs Sampling (Casella and George, 1992) is a special case of the Metropolis- 
Hastings algorithm. It deals with the problem of random sampling from a multivariate 
distribution, which is defined in terms of a collection of conditional distributions of its 
component random variables or sub-vectors, in such a way that this collection uniquely 
defines the target joint distribution. Gibbs sampling can be regarded as component- 
wise Metropolis-Hastings algorithm. The formulation of the defining marginal and 
conditional distributions often arise naturally in Bayesian problems in terms of the 
observable variable distributions and the prior distributions. These then give rise to a 
random vector which is the posterior distribution of the observable-cum-parameters. 
This posterior distribution is often difficult to work out analytically, necessitating the 
development of Monte Carlo procedures like Gibbs Sampling. SYSTAT’s Gibbs 
Sampling feature only handles the situation called the case of full conditionals, 
wherein the defining collection consists of the conditional distribution of each single 
component of the random vector given the values of the rest. SYSTAT considers only 


those cases where such conditional distributions are standard distributions available in 
its list. 


Gibbs Sampling Algorithm 


Let x = ( X1, Xp,..., Xp) 


Given x= (x1, x20,.... xp 0) 
generate x/(**!? from the conditional distribution f(x}! x30,..., x”), 


generate x, from the conditional distribution fox x, x,,..., ys 


. and 


generate xp") from the conditional distribution fp! HD mt... Xp +) 


The above Gibbs sampling procedure, Starting from x 0, mi, x 0) generates a 
‘Gibbs sequence’ x"), x,0...., xD... 119 x... x 0) IL Marons: This 
sequence is a realization of a Markov chain with a stationary distribution, ich is the 
unique multiyariat distribution defined by the full conditionals. Thus for large n, x q, 
Xgh casi Xp. can be considered as a random sample from the target multivariate 
distribution. The Gibbs Sampling method is also useful to approximate the marginal 
density f(x;), i=1,2,...,p of a joint density function £(X1,X9,...,Xp) or its parameters by 
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averaging the final conditional densities of each sequence-for that matter, marginal 
multivariate densities and their parameters. 


Integration 


In Monte Carlo Integration, an integral or equivalently an expectation of a random 
variable is approximated by the sample mean of a function of simulated random 
samples. Suppose a random sample {x}, x2,....Xn) is generated from a distribution with 
density f(x) of random variable X and h(X) is a function of X, then 


EANN = [hf Cd 


can be estimated by 
A 1 n 
I, = > h(x,) 


This is called Classical Monte Carlo Integration estimator. By the strong law of 
large numbers J, converges almost surely to E¡[h(X)]. The standard error of the 


estimate is 


2 
Lie i 
n(n—1) Sho, 


i=l 
Importance Sampling (Geweke, 1989; Hesterberg, 1995) is another way to estimate 
E;[h(X)] by drawing an independent sample x},X3,..-.Xy from a distribution of a given 


importance density g(x), with g(x) > 0 and A(x)f(x) #0 
The integral can be estimated by 


fy a CAI) 


where w(x;)=f(x;)/g(x;), i=1,2,....n, is a weight function, with standard error 
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2 
1 n 
x. AG, à E 
n(n-1) $| Cada 7 


The optimal importance density for minimizing the variance of integration estimator is 


yu Pao 
EL lod 


The integration estimate can also be computed by Importance sampling ratio estimate 


a 


I 


w 


$ nx Jm) 
a 
$ wa) 


ial 


and the corresponding standard error is 


Y n00-1, ] bee 
i=l 
5 2 
pa 


i=l 


The advantage of using the ratio estimate compared to the integration estimate is that 
in using the latter we need to know the weight function (i.e., ratio of target and 
importance functions) exactly, whereas in the former case, the ratio needs to be known 
only up to a multiplicative constant. If the support of importance function g(x) consists 
of the support of density function f(x), then the generated samples are i.i.d. and 
Importance Sampling estimator converges almost surely to the expectation. 


Monte Carlo Integration methods in SYSTAT are invoked only after generating a 
random sample using any one of the univariate discrete and univariate continuous 
random sampling methods, Rejection Sampling, Adaptive Rejection Sampling, or the 
M-H algorithms. SYSTAT computes a Monte Carlo integration estimate and standard 
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error. There is a choice between the Classical Monte Carlo Integration method and the 
two Importance Sampling methods. Classical Monte Carlo Integration estimate is 
evaluated with respect to the density function related to the distribution from which 
samples are generated. In the Importance Sampling Integration and Importance 
Sampling Ratio procedures, the importance function is the density related to the 
distribution from which samples are generated. 


Rao-Blackwellized Estimates with Gibbs Samples 


Let us consider a statistical problem of drawing inferences about a parameter O of the 
probability density function f(x10), based on a sample [x ,,x3,...,Xn). By the sufficiency 
principle, a statistic T is sufficient for O if the conditional distribution of the sample 
given the statistic does not involve the parameter O. By Rao--Blackwell theorem, the 
conditional expectation of the estimator given a sufficient statistic is an improved 
estimator; i.e., when O(X1,X2::-»Xp) is an estimator of O with finite variance, 

Var (EIS 1,X2»--.,Xp)! T=t]} Var (Ó(x1,X2».--:Xp)). Using a conditional expectation 
with respect to another estimator as an improved estimator is often called Rao- 
Blackwellization, even if the conditioning statistic is not a sufficient statistic. This 
leads to the use of the Rao-Blackwellized estimator 


instead of the empirical estimator 
1 n 
7 > es E ) 
tal 


in Gibbs Sampling method. See Liu et al. (1994), Robert and Casella (2004) for details. 
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Precautions to be taken in using IID Monte Carlo and MCMC features 


You may obtain absurd results if suitable inputs are not given in the case of IID Monte 
Carlo and MCMC algorithms. Some of the precautions to be taken are: 


m In Rejection Sampling, the output may not be appropriate if the target function’s 
support is not a subset of the support of the proposal function or if the target 
function is not dominated by a constant times the proposal density function. 


m Log-concavity of target function has to be checked by you before using the ARS 
algorithm. 


m If you get a sample that does not cover the entire parameter space of left or right 
bounded and bounded target functions using ARS algorithm, you should check 
whether the assigned starting points consist of corresponding bounded values. 


= In M-H algorithms, it is your responsibility to generate an ergodic Markov chain 
by selecting a suitable proposal density function. 


E You should ensure that the expectation of the integrand exists before doing Monte 
Carlo Integration. 


m The time required to generate samples using MCMC algorithms depends, among 


other factors, on the burn-in period and gap, and in some situations may be quite 
large. 


While using SYSTAT’s Gibbs Sampling algorithm to generate random samples from 
the distribution of a p-dimensional random vector X = (Xj, Xo, 


«== Xp), you should 
note that: 


m The input (defining conditional distributions) consists of only univariate 
distributions from SYSTAT’s list of distributions. 


M The input should give the marginal distributions of each X; given the rest of the 


components of X. 


The parameters of the conditional distributions have to be specified in the specified 
syntax, 


It is your responsibility to ensure that the above inputs satisfy the conditions 


required for them to define uniquely the joint distribution of the components of X 
as your target distribution. 
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Monte Carlo Methods in SYSTAT 


Random Sampling 


Before using the Random Sampling feature you should study the list of distributions, 
the form of the density functions, especially in respect of the parameters and the names 
and notations for the parameters, from the volume Data: Chapter 4: Data 

Transformations: Distribution Functions. It may also be useful to consult the references 
therein for the properties of these distributions and the meanings of the parameters. The 
distributions are divided into three groups---univariate discrete, univariate continuous 


and, multivariate. 


Univariate Discrete Distributions Dialog Box 


To open the Monte Carlo: Random Sampling: Univariate Discrete Distributions dialog 
box, from the menus choose: 


Addons 
Monte Carlo 
Random Sampling 
Univariate Discrete... 


Monte Carlo: Random Sampling: Univariate Discrete 


Number of samples: 


Sample size: 
Distribution 
[Discrete uniform 


Number of points (N): 
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Number of samples. Enter the number of samples that you want to generate. 
Sample size. Enter the size of the sample you want to generate. 


Random seed. The default random number generator is the Mersenne-Twister 


algorithm. For the seed, specify any integer from 1 to 4294967295 for the MT 
algorithm and 1 to 30000 for the Wichmann-Hill algorithm; otherwise SYSTAT uses a 
seed based on system time. 


Distribution. Choose the distribution from the drop-down list. The list consists of nine 
univariate discrete distributions: Benford's Law, Binomial, Discrete uniform, 
Geometric, Hypergeometric, Logarithmic series, Negative binomial, Poisson, and 
Zipf. Enter the values of the parameters (depending on the distribution selected) in the 
boxe(es). 


Save file. You can save the output to a specified file. 


Univariate Continuous Distributions Dialog Box 


To open the Monte Carlo: Random Sampling: Univariate Continuous Distributions 
dialog box, from the menus choose: 


Addons 
Monte Carlo 
Random Sampling 
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Number of samples. Enter the number of samples that you want to generate. 
Sample size. Enter the size of the sample you want to generate. 


Random seed. The default random number generator is the Mersenne-Twister 
algorithm. For the seed, specify any integer from 1 to 4294967295 for the MT 
algorithm and 1 to 30000 for the Wichmann-Hill algorithm; otherwise SYSTAT uses a 
seed based on system time. 


Distribution. Choose the distribution from the drop-down list.The list consists of 
twenty eight univariate continuous distributions: Beta, Cauchy, Chi-square, Double 
exponential, Erlang, Exponential, F, Gamma, Generalized lambda, Gompertz, 
Gumbel, Inverse Gaussian (Wald), Logistic, Loglogistic, Logit normal, Lognormal, 
Non-central chi-square, Non-central F, Non-central t, Normal, Pareto, Rayleigh, t, 
Smallest extreme value, Studentized range, Triangular, Uniform, and Weibull. Enter 
the values of the parameters (depending on the distribution selected) in the box(es). 


Save file. You can save the output to a specified file. 


Multivariate Distributions Dialog Box 


To open the Monte Carlo: Random Sampling: Multivariate Distributions dialog box, 
from the menus choose: 


Addons 
Monte Carlo 
Random Sampling 
Multivariate... 


16 


Monte Carlo 


Multivariate 


Number of samples. Enter the number of samples that you want to generate. 
Sample size. Enter the size of the sample you want to generate. 


Random seed.The default random number generator is the Mersenne-Twister 
algorithm. For the seed, specify any integer from 1 to 4294967295 for the MT 


algorithm and 1 to 30000 for the Wichmann-Hill algorithm; otherwise SYSTAT uses a 
seed based on system time. 


Distribution. Choose the distribution from the drop-down list. The list consists of five 
multivariate distribitions: Bivariate exponential, Dirichlet, Multinomial, Multivariate 


normal, and Wishart. Enter the values of the parameters (depending on the distribution 
selected) in the box(es). 


Save file. You can save the output to a specified file. 


Using Commands 


For Univariate Discrete and Continuous random sampling: 


RANDSAMP 
SAVE filename 
UNIVARIATE distribution notation (parameterlist) /SIZE=n1 


NSAMPLE=n2 RSEED=n3 


The distribution notation consists of parameter values as its arguments. 
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For multivariate random sampling: 
RANDSAMP 
SAVE filename 


MULTIVARIATE distribution notation (parameterlist) / 
SIZE=n1 NSAMPLE=n2 RSEED=n3 


The distribution notation consists of parameter values as its arguments. 


Distribution Notations used in Random Sampling 


Distribution Name Distribution Parameter(s) 
Notation 
Benford’s law BLRN (B) 
Binomial NRN (n,p) 
Discrete uniform DURN (N) 
Geometric GERN (p) 
Hypergeometric HRN (N,m,n) 
Logarithmic series LSRN (theta) 
Negative binomial NBRN (k,p) 
Poisson PRN (lambda) 
Zipf ZIRN (shp) 
Beta BRN (shp1,shp2) 
Cauchy CRN (loc,sc) 
Chi-square XRN (dí) 
Double exponential (Laplace) DERN (loc,sc) 
Erlang ERRN (shp,sc) 
Exponential ERN (loc,sc) 
F FRN (dt1,df2) 
Gamma GRN (shp,sc) 
Generalized lambda GLRN (lambda1,lambda2,lambda3,lambda4) 
Gompertz GORN (b,c) 
Gumbel GURN (loc,sc) 
Inverse Gaussian (Wald) IGRN (loc,sc) 
Logistic LRN (loc,sc) 
Logit normal ENRN (loc,sc) 
Loglogistic LORN (logsc, shp) 
Lognormal LNRN (loc,sc) 


Non-central chi-square NXRN (df1,nc) 
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Distribution Name 


Non-central F 
Non-central t 
Normal 

Pareto 

Rayleigh 

Smallest extreme value 
Studentized range 

t 

Triangular 

Weibull 

Uniform 

Bivariate exponential 
Dirichlet 
Multinomial (k,p) 
Multivariate normal 
Wishart 


* + * + * 


Distribution 
Notation 


NFRN 
NTRN 
ZRN 
PARN 
RRN 
SERN 
SRN 
TRN 
TRRN 
WRN 
URN 
BERN 
DIRN 
MRN 
ZPRN 
WIRN 


Parameter(s) 


(df1,df2,nc) 
(df.nc) 
(loc,sc) 
(sc,shp) 
(sc) 
(loc,sc) 


(a,b,c) 

(sc,shp) 

(a,b) 

(lambda1, lambda2, lambda3) 
(k.P) 

(k,P,n) 

(p, mu, sigma) 

(p, df, sigma,c) 


where low is the smallest value and hi, the largest value; loc is the location parameter 
and sc, the scale parameter; shp is the shape parameter and thr, the threshold parameter, 
nc is the non-centrality parameter (for univariate non-central distributions), and df is 


the degrees of freedom. 


Note: * indicates multivariate distributions. 


Example: Normal random number with parameters (0, 1) 


RANDOMSAMP 


UNIVARIATE ZRN(0,1) 


For parameter description of multivariate distributions please refer section PDFs of 
Multivariate Distributions after Usage Considerations. 
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Rejection Sampling Dialog Box 


To open the Monte Carlo: IIDMC: Rejection Sampling dialog box, from the menus 
choose: 


Addons 
Monte Carlo 
IIDMC 
Rejection Sampling... 


Number of samples. Enter the number of samples that you want to generate. 
Sample size. Enter the size of the sample that you want to generate. 


Random seed. The default random number generator is the Mersenne-Twister 
algorithm. For the seed, specify any integer from 1 to 4294967295 for the MT 
algorithm and 1 to 30000 for the Wichmann-Hill algorithm; otherwise SYSTAT uses a 


seed based on system time. 

Target function. Specify your target function in the required syntax. 

Constant. Enter the value that is an upper bound to supremum of ratio of target to 
proposal functions. 

Proposal. Select a suitable proposal distribution function. The list consists of twenty 
univariate continuous distributions: Beta, Cauchy, Chi-square, Double Exponential , 
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Exponential, F, Gamma, Gompertz, Gumbel, Inverse Gaussian , Logistic, Logit 
normal, Lognormal, Normal, Pareto, Rayleigh, t, Triangular, Uniform, and Weibull. 


Save file. You can save the output to a specified file. 


Adaptive Rejection Sampling Dialog Box 


To open the Adaptive Rejection Sampling dialog box, from the menus choose: 


Addons 
Monte Carlo 
IIDMC 


Number of samples. Enter the number of samples that you want to generate. 
Sample size. Enter the size of the sample that you want to generate. 
Random seed. The default random number generator is the Mersenne-Twister 


algorithm. For the seed, specify any integer from 1 to 4294967295 for the MT 


algorithm and 1 to 30000 for the Wichmann-Hill algorithm; otherwi 
pis g m; otherwise SYSTAT uses a 


Target function. Speci : E t i 
brit) pecify your target function, which should satisfy the log concavity 
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Support of target. The method at first constructs a proposal function using initial 
points on the support of target distribution and extends it depending on the type of the 
target function. Bounds and starting points should be given. 


Unbounded. Specifies the support of target as unbounded. The two points are 
starting points. 


Right bounded. Specifies the support of target as right bounded. The left point is a 
starting point and the right one is a bound. 

Left bounded. Specifies the support of target as left bounded. The right point is a 
starting point and the left one is a bound. 

Bounded. Specifies the support of target as bounded. The left and right starting 
points are bounds. 

Left point/bound. Enter a point preferably to the left side of the mode of the target 
function. 

Right point/bound. Enter a point preferably to the right side of the mode of the 
target function. 


Save file. You can save the output to a specified file. 


Using Commands 


For the Rejection Sampling method: 


IIDMC 
SAVE FILENAME 
REJECT RJS(targetexpression, proposalexpression, constant)/ 
SIZE=n1 NSAMPLE= n2 RSEED=n3 


The distribution notation for the proposal expression can be chosen from the notations 
of Beta, Cauchy, Chi-square, Exponential, F Gamma, Gompertz, Gumbel, Double 
Exponential(Laplace), Logistic, Logit normal, Lognormal, Normal, Pareto, Rayleigh, 
t,Triangular, Uniform, Inverse Gaussian (Wald), and Weibull distributions. 


For the Adaptive Rejection Sampling method: 


TIDMC 
SAVE FILENAME 
REJECT ARS (targetexpression, rangeexpression)/SIZE=n1 
NSAMPLE= n2 RSEED=n3 
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Range expressions for target function in ARS command are listed in the following 


table: 


Range expression 
UB(a,b) 


LB(a,b) 


RB(a,b) 


BD(a,b) 


M-H Algorithm Dialog Box 


Description 
Specifies the target as unbounded. The two points are starting 
points. 

Specifies the target function as left bounded. The left point is a 
bound as well as starting point whereas the right point is a start- 
ing point only. 

Specifies the target function as right bounded. The right point is 
a bound as well as starting point whereas the left point is a start- 
ing point only. 


Specifies the support of target as bounded. The left and right 
starting points are also bounds. 


In the Monte Carlo: MCMC: Metropolis-Hastings algorithm, specify the particular 
type of algorithm you want to use: Random walk, Independent or hybrid. 


To open the M-H Algorithm dialog box, from the menus choose: 


Addons 
Monte Carlo 
MCMC 
M-H Algorithm... 
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Target functior: 
Suppl top TO linate | Random ak] =e) 


© A “| -Parameters 

i LEA ig tora | — 
tecno ce For om {fon B 
o $ bd; PROPRIA ER i a Ae Location or mean (mu): 


Scale or SD (sigma) 


Number of samples. Enter the number of samples that you want to generate. 
Sample size. Enter the size of the sample that you want to generate. 


Random seed. The default random number generator is the Mersenne-Twister 
algorithm. For the seed, specify any integer from 1 to 4294967295 for the MT 
algorithm and 1 to 30000 for the Wichmann-Hill algorithm; otherwise SYSTAT uses a 


seed based on system time. 

Burn-in. Enter the size of random sample to be discarded initially from the chain. 
Gap. Enter the difference between the indices of two successive random observations 
that can be extracted from the generated sequence. 

Target function. Specify your target function. 

Algorithm type. Select the algorithm from the following: 

m Random walk. Generates random sample using RWM-H algorithm. 

m Independent. Generates random sample using IndM-H algorithm. 

m Hybrid RWInd. Generates random sample using Hybrid RWInd M-H algorithm. 
Support of target. Support of your target distribution can be specified as bounded, left 
bounded, right bounded, and unbounded. 

= Unbounded. Specifies the support of target as unbounded. 

m Right bounded. Specifies the support of target as right bounded. 
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Left bounded. Specifies the support of target as left bounded. 
Bounded. Specifies the support of target as bounded. 


Left bound. Enter the left bound to the target, which has its support as left 
bounded/bounded. 


m Right bound. Enter the right bound to the target, which has its support as right 
bounded/bounded. 


Initial value. Specifies initial value to the Markov chain to be generated. Select a 
distribution from the drop-down list that has the same support as that of the target 
distribution. 

Proposal. An appropriate proposal distribution should be selected from the list of 
twenty univariate continuous distributions: Beta, Cauchy, Chi-square, Double 
Exponential (Laplace), Exponential, F, Gamma, Gompertz, Gumbel, Inverse Gaussian 


(Wald), Logistic, Logit normal, Lognormal, Normal, Pareto, Rayleigh, t, Triangular, 
Uniform, and Weibull. 


= Random Walk. Select one from the given univariate continuous distributions list, 
when Random Walk Metropolis-Hastings algorithm or Hybrid Random Walk 
Independent Metropolis-Hastings algorithm is selected. When the proposal density 
is continuous and positive around zero, the generated RWM-H chain is ergodic. It 
is recommended that a symmetric proposal distribution around zero be used. 


Independent. Select one from the given univariate continuous distributions list, 
when Independent Metropolis-Hastings algorithm or Hybrid Random Walk 
Independent Metropolis-Hastings algorithm is selected. It is suggested that the 
proposal be selected in such a way that the weight function is bounded. In that case, 
the generated Markov chain is uniformly ergodic. It must be ensured that the 
support of the proposal distribution covers the support of the target distribution. 


Save file. You can save the output to a specified file. 


Gibbs Sampling Dialog Box 


P open the Monte Carlo: MCMC: Gibbs Sampling dialog box, from the menus 
choose: 


Addons 
Monte Carlo 
MCMC 
Gibbs Sampling... 
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Monte Carlo: MCMC: Gibbs Sampling 


Number samples [I_ | Samia sn: [1000 Fandan sed Geol | Buminfioo | 


[Use file 


Number of samples. Enter the number of samples that you want to generate. 
Sample size. Enter the size of the multivariate sample that you want to generate. 


Random seed. The default random number generator is the Mersenne-Twister 
algorithm. For the seed, specify any integer from 1 to 4294967295 for the MT 
algorithm and 1 to 30000 for the Wichmann-Hill algorithm; otherwise SYSTAT uses a 


seed based on system time. 

Gap. Enter the difference between the indices of two successive random observations 
that can be extracted from the generated sequence. 

Burn-in. Enter the size of random sample to be discarded initially from the chain. 
Use file. Open the data file, where variables in the data file are part of the parameter 
expressions of full conditionals. 

Full conditionals. Specify the full conditional distributions. 

m Variables. Enter the variable for which you want to generate random sample. 


m Distribution. Select the required distribution from the list provided. The list 
consists of seven univariate discrete and twenty univariate continuous 
distributions. They are Binomial, Discrete uniform, Geometric, Hypergeometric, 
Poisson, Negative Binomial, Zipf, Beta, Cauchy, Chi-square, Double Exponential 
(Laplace), Exponential, F, Gamma, Gompertz, Gumbel, Inverse Gaussian (Wald), 


Monte Carlo 


Logistic, Logit normal, Lognormal, Normal, Pareto, Rayleigh, t, Triangular, 
Uniform, and Weibull. 


Parameter. Specify the expression or number for the parameter related to the 
distribution. 


m Initial Value. Enter the initial value of each variable. 


Save file. You can save the output to a specified file. 
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Integration Dialog Box 


To obatin the Monte Carlo: Integration dialog box, from the menus choose: 
Addsons 


Monte Carlo 
Integration... 


Monte Carlo: Integration 


Integrand: 
Method 
© Classical Monte Carlo. 
O Importance sampling integration 


Oliena ng Re: 


RETTA 


Integrand. Specify the function for which integration estimate is required. 


Method, Select the integration method you want: 
m Classical Monte Carlo. Computes classical Monte Carlo integration estimate. 
= Importance sampling integration. Computes Importance Sampling integration 


estimate. 
= Importance sampling ratio. Computes Importance Sampling ratio estimate. 


Density function. Type your density function, which is the numerator of the weight 
function in the Importance Sampling method. 
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Using Commands 


For the Metropolis-Hastings Algorithm: 


MCMC 
SAVE FILENAME AU é 
MH MHRW (target exp, range exp , proposal exp, initialvalue exp) / 
SIZE= nl NSAMPLE=n2 BURNIN=n3 GAP=n4 RSEED=n5 
or TUR 
MHIND (target_exp, range_exp, proposal_exp, initialvalue exp) / 
T SIZE= nl NSAMPLE=n2 BURNIN=n3 GAP=n4 RSEED=n5 
or 
MHHY (target exp, range exp , rw proposal exp , 
ind proposal exp, initialvalue exp) / 
SIZE= nl NSAMPLE=n2 BURNIN=n3 GAP=n4 RSEED=n5 


Range expressions for target function in MH command are listed in following table: 


Range expression Description 


UB Specifies the target as unbounded, 

LB(a) Specifies the target function as left bounded at a. 

RB(b) Specifies the target function as right bounded at b. 

BD(a,b) Specifies the support of target as bounded between a and b. 


The distribution notation for the proposal can be chosen from the notations of Beta, 
Cauchy, Chi-square, Exponential, F, Gamma, Gompertz, Gumbel, Double Exponential 
(Laplace), Logistic, Logit normal, Lognormal, Normal, Pareto, Rayleigh, t, Triangular, 
Uniform, Inverse Gaussian (Wald), Weibull distributions. 


For the Gibbs Sampling method: 


MCMC 
USE FILENAME 


VARIABLE DECLARATIONS 
GVAR VARLIST 
FUNCTION fname_1() 


ras OF STATEMENTS 


FUNCTION fname_k ( ) 


LIST OF STATEMENTS 


GIBBS (fname_1(),..,fname_ k()) /SIZE=n1 NSAMPLE=n2 BURNIN=n3 
GAP=n4 RSEED=n5 
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or 
MCMC 
USE FILENAME 
VARIABLE DECLARATIONS 
GVAR VARLIST 
GIBBS (fcexpname_1,.., fcexpname k) /SIZE=nl NSAMPLE=n2 
BURNIN=n3 GAP=n4 RSEED=n5 


The distribution notations for full conditionals can be chosen from the notations of 
Binomial, Discrete Uniform, Geometric, Hypergeometric, Negative Binomial, 
Poisson, Zipf, Beta, Cauchy, Chi-square, Exponential, F, Gamma, Gompertz, Gumbel, 
Double Exponential (Laplace), Logistic, Logit normal, Lognormal, Normal, Pareto, 
Rayleigh, t, Triangular, Uniform, Inverse Gaussian (Wald) and Weibull distributions. 


For the Monte Carlo Integration method: 


INTEG expression ; MC or IMPSAMPI DENFUN=expression or IMPSAMPR 
DENFUN = expression 


Monte Carlo Integration methods in SYSTAT can be used only after generating random 
samples from any one of univariate discrete and univariate continuous random 
sampling methods, Rejection Sampling, Adaptive Rejection Sampling and M-H 
algorithms. 


Usage Considerations 


Types of data. Gibbs Sampling and Monte Carlo Integration use rectangular data only. 
For the remaining features no input data are needed. 


Print options. There are no PLENGTH options. 


Quick Graphs. Monte Carlo produces no Quick Graphs. You use the generated file and 
produce the graphs you want. For more information. refer examples. 


Saving files. Generated samples can be saved in the file mentioned. For all distributions 
(except Wishart) case number refers to observation number. For all univariate 
distributions column names are sl, 52, ...(number after s denotes sample number). For 
multivariate distributions, the format of the saved/output file is as follows: Column 
name format is “s*v*”, where * after s denotes sample number and * after v denotes 
variable number. For Wishart, the leading column “OBS. NO” with elements “o*v*”, 
where * after o denotes observation number and * after v denotes variable number. The 
output format of Rejection sampling, ARS and M-H algorithms are the same as the 


30 
Monte Carlo 


univariate distributions. For Gibbs Sampling, column name is the name of the variable 
with sample number. 


By groups. By groups is not relevant. 
Case frequencies. Case frequency is not relevant. 


Case weights. Case weight is not relevant. 
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Distribution Notations used in IIDMC and MCMC 


Rejection 
Distribution Sampling 
M-H Algorithm 

Uniform U 
Normal Zz 

t T 
F F 
Chi-square X 
Gamma G 
Beta B 
Exponential E 
Logistic L 
Studentized Range 
Weibull w 
Cauchy Cc 
Double Exponential DE 
Gompertz GO 
Gumbel GU 
Inverse Gaussian IG 
(Wald) 

Logit Normal EN 
Lognormal LN 
Pareto PA 
Rayleigh R 
Triangular TR 
Binomial 

Poisson 

Discrete Uniform 
Geometric 
Hypergeometric 
Negative Binomial 

Zipf 


Parameter 


(a,b) 
(loc,sc) 
(df) 
(df1,df2) 
(at) 
(shp,sc) 
(shp1,shp2) 
(loc,sc) 
(loc,sc) 
(k,df) 
(sc,shp) 
(loc,sc) 
(loc,sc) 
(b,c) 
(loc,sc) 


(loc,sc) 


(loc,sc) 
(loc,sc) 
(sc,shp) 
(sc) 
(a,b,c) 
(n,p) 
lambda 
(N) 

(p) 
(N,m,n) 
(k,p) 
(shp) 


Monte Carlo 


where low is the smallest value and hi, the largest value; loc is the location parameter 
and sc, the scale parameter; shp is the shape parameter and thr, the threshold parameter, 
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nc is the non-centrality parameter (for univariate non-central distributions), and df is 
the degrees of freedom. 


Note: In Gibbs sampling distribution notations are the same as in random sampling (all 
univariate distribution). 


PDFs of Multivariate Distributions 
Multinomial: 
Parameters: k, P, n 
k: number of cells (occurrences), 
P: probability vector for k cells 
n: number of independent trials , 
Note: n is optional; if not specified takes default value 1. 


Positive integers: n, k (>2), Cell probabilities p's, i=1,2,...,k, should add to | (p;'s in 
(0,1)) 


PDF: 


! k k 
PIN, =n,i=1,2,...k] = Elo Ho. n21,Yn=n 


II n, y Al i=] 


i=l 
Note: Case “k=2” is binomial distribution and “k=1” degenerate. 


Bivariate exponential: (Marshal-Olkind Model) 
Parameters:Ay, Ay, A 


A: positive real (Failure rate 1) 
À»: positive real (Failure rate 2) 


A: positive real (Failure rate 3) 


Monte Carlo 


PDF: 


ANA +A)F(x,,%;), for 0<x,<x, 
foto =s 4 (A, +4) F(x,,x,), for 0<x,<x, 
AED x =x=x>0 


where, 


F(x,x,))=expf-Ax -4,% — Ap max(x,,x,)}, for 0<x,,x, 
Note: A> positive real (Failure rate 3), sometimes denoted by Az. 


Dirichlet 


Parameters: k, P 
k : positive integer (>2) 


P: k dimensional vector of shape parameters (each component is positive re 


k 
PDF: Each x; is in [0,1] such that be? =] 


i=l 


A -l 
E © bc 


ae 


I Weyer 


ro») 
firo 


Note: Case k=2 is beta distribution. 


Multivariate normal: 


Parameters: p, mu, sigma 
p: positive integer (>1) 
mu: p x 1 vector of reals 


sigma: pxp symmetric positive definite matrix. 
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PDF: 


Ff) = 7)?" 


ap E p)) xe R” 


Wishart: 

Parameters:p, m, sigma, c 

p: positive integer(>1) 

m: positive integer (>=p)(degree of freedom) 
sigma: pxp symmetric positive definite matrix 
c : pxp matrix of non-centrality parameters. 


Let Y1’, Y27,....Ym be independent p variate normal with parameters (mu);, 
i=1,2,...,m, and same sigma, then: 


w= Y xy 


isl 


has non-central Wishart distribution with parameters (m, sigma, c) where 


c =(E(Y))*(E(¥))*Z" and Y is (m by p) matrix with ith row as Yi. 
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PDF: 


W=Y'*Y. Probability density of W matrix is given as 


fy (S) = w(p,Z,m, Ms" Apr AE: E es 


where M=E(Y), (m by p matrix) 


-1 
WoE mM) [TE] PT pl MM), 


T; E i «non +1-5)] 
i=l 


and ¿F, i is the hypergeometric function. 


Expressions in Monte Carlo 


For IIDMC and M-H algorithms, the target functions from which random sample is 
generated are expressions which involve mathematical functions of a single variable. 
The integrand in Monte Carlo Integration and the density function in Importance 
Sampling procedures are expressions. In the Gibbs Sampling method, the parameters 
of full conditionals are expressions, which may involve variables from a data file and 
mathematical functions. For construction of expressions you can use all numeric 
functions from SYSTAT’s function library. 
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Examples 


Example 1 
Sampling Distribution of Double Exponential (Laplace) Median 


This example generates 500 samples each of size 20 and investigates the distribution 


of the sample median by computing the median of each sample. 
The input is: 


RANDSAMP 
UNIVARIATE DERN(2,1) / SIZE=20 NSAMPLE=500 RSEED=23416 


Using the generated (500) samples, the distribution of sample median can be obtained. 
The input is: 


SSAVE 'STATS' 

STATISTICS S1 .. S500 /MEDIAN 
USE 'STATS' 

TRANSPOSE S1..S500 

VARLAB COL(1) / 'MEDIAN' 


CSTATISTICS COL(1) / MAXIMUM MEAN MINIMUM SD VARIANCE N 
SWTEST 


BEGIN 

DENSITY COL(1) / HIST XMIN=0 XMAX=4 
DENSITY COL(1) / NORMAL XMIN=0 XMAX=4 
END 


The output is as follows; COL(1) contains the sample medians. 
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SAMPLING DISTRIBUTION OF LAPLACE(2,1) MEDIAN 


04 


2 
pa 


Jeg sad uoniodolg 


R 


A 


0.0 


| MEDIAN 
meen aime e anne ner eps eect 
N of Cases i 500 
Minimum 4 51.3980 
Maximum i 2.665 
Arithmetic Mean 1. 1.994 
Standard Deviation i 0.248 
Variance i 0.062 
Shapiro-Wilk Statistic | 0.994 

+ 0,050 


Shapiro-Wilk p-value 


We observe that the sampling distribution of the double exponential sample median can 
be described to be normal. 


Example 2 
Simulation of Assembly System 


Consider a system having two parallel subsystems (A and B) connected in a series with 
another subsystem (C), as shown in the structural diagram. In such a system, work at 
"C" can start only after the work at "A" and "B" is completed. The process completion 
time for this system is the maximum of the processing times for "A" and "B" plus the 


processing time for "C". 


sk 
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Ae 


Assume that the system is a production line for a specific product, and that the 
processing time distributions for the three subsystems are independent. Let us specify 
that: 


A~ Exponential (0.2, 0.7) 
B~ Uniform (0.2, 1.2) 
C~ Normal (2, 0.3) 


The production engineer wants to find the distribution of manufacturing time and to 
estimate the probability that the manufacturing time is less than 5 units of time. 


The input is: 


RANDSAMP 

UNIVARIATE ERN (0.2, 0.7)/SIZE = 10000 NSAMPLE=1 
RSEED=123 

LET S2=URN (0.2, 1.2) 

LET S3=ZRN (2, 0.3) 

LET TIME=MAX (S1, S2) +S3 

DENSITY TIME / HIST 

LET PROB = (TIME <=5) 

CSTATISTICS PROB/MEAN 
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The output is: 


3000 nr 
2000 |- “i 
a 
& 
8 
1000 |- n 
o 
0 6 10 18° 
TIME 
PROB 
Arithmetic Mean | 0.979 


The output shows the histogram of 10000 simulated manufacturing times; the 
estimated probability that manufacturing time is less than 5 time units is 0.979. 


Example 3 
Generation of Random Sample 
Model) Distribution 


from Bivariate Exponential (Marshal-Olkin 


tudy the joint distribution of two specific electronic 
her prior knowledge she knows the mean failure 
as 1.3 (units). 


An electronics engineer wants to 5 


subsystems in her assembly. From 
time for the first subsystem as 1.2 (units) and for the second subsystem, 


If some strong shock occurs, then both of these subsystems fail. She also knows the 
mean occurrence time for this strong shock as 0.1 (units). Assuming the Marshal-Olkin 
model, realization of her assembly failures are carried out and the input is: 


(1,1,0.1) / SIZE = 10000 NSAMPLE=1 


RSEED=542375 
CSTATISTICS / MAXIMUM MEAN MINIMUM SD VARIANCE N 


PLOT S1V1*S1V2/ BORDER=HIST 
GRAPH OFF 


RANDSAMP 
MULTIVARIATE BERN 


CORR 
PEARSON S1V1 s1v2 
GRAPH ON 


40 
Monte Carlo 


The output is: 


N of Cases 
Minimum 
Maximum 8.336 8.163 
Arithmetic Mean 

Standard Deviation | 0.914 0.907 
Variance } 


Number of Observations: 10000 


Pearson Correlation Matrix 


| S1V1 $1v2 
Sst saa CA gd 
S1V1 | 1.000 
S1V2 | 0.047 1.000 
Example 4 


Evaluating an Integral by Monte Carlo Integration Methods 


nx 
This example explains the evaluation of f cos 2 


using Monte Carlo Integration 
methods. ` 


Using the Classical Monte Carlo Integration method, the integration can be evaluated 
by 
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P Es TA 
[== Y cos — |, 
nà 2 


where x; are generated from the uniform distribution on[0,1]. 
The input is: 


RANDSAMP 
UNIVARIATE URN(0,1)/ SIZE=10000 NSAMPLE=1 RSEED=76453782 


MCMC 
INTEG COS (3.14159*X/2); MC 


The output is: 
Classical Monte-Carlo Integration estimates for Sl 


Empirical Mean : 0.635 
standard Error: 0.003 


Importance Sampling, a variance reduction technique can be used to evaluate the given 
integration more accurately. An optimal importance function (3/2)(1-x?), which is 
proportional to cos(arx /2), can be used to estimate the above integral by 


a 14 1 
I ==» eos (1/2) as 
i nt (ml apace) 
Since (3/2)(1 2) is a log-concave function on (0, 1), the ARS algorithm can be used to 
generate random samples from this density and the input is: 


FORMAT 9,6 
IIDMC 


REJECT ARS((3/2)*(1-X°2) ,BD(0.0,1.0)) /SIZE=5000 


RSEED=76453782 


MCMC 
INTEG FUN='COS (PI*X/2) ' DENFUN='1' /IMPSAMPI 


FORMAT 


The output is: 
Importance Sampling Integration estimates for S1 


Weighted Mean: 0.636293 
Standard Error: 0.000449 


The Importance Sampling Integration estimate is an improved one compared to the 
Classical Monte Carlo Integration Estimate. 
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Example 5 
Rejection Sampling 


If cis an integer, random samples from gamma(o, B) can be generated by adding o. 
number of independent exponential()’s. But if cris not an integer, this simple method 
is not applicable. Even though we can generate random samples from gamma(o, B) 
using SYSTAT’s univariate continuous random sampling procedure, this example 
illustrates an alternative method, using Rejection Sampling by considering 
uniform(0,15), exponential(2.43), and gamma ([2.43],2.43/[2.43]) distributions 
(Robert and Casella, 2004) as proposals in different exercises. [2.43] is the integer part 
of 2.43. 


| i 
m Generating random sample from (T(2.43)) exp(—x) E (1 43) using the 
uniform density function as proposal, computing basic statistics from this sample 
and approximating E(X?) using Monte Carlo Integration method. 


The input is: 


TIDMC 
REJECT RJS((1/(EXP(LGM(2.43))))*EXP(-X)*X*1.43,U(0,15),4.7250) 


/SIZE=100000 NSAMPLE=1 RSEED=3245425 
MCMC 


INTEG X*2 ; MC 


CSTATISTICS S1 / MAXIMUM MEAN MINIMUM SD VARIANCE N 
DENSITY S1 / HIST 


' 


The output is: 


Classical Monte-Carlo Integration estimates for Sl 


Empirical Mean: 8.308 
Standard Error: 0.036 


i 51 
cr portal 
N of Cases + 100000 
Minimum | 0.020 
Maximum i 14.395 
Arithmetic Mean i 2.426 
Standard Deviation | 1.557 


Variance | 2.424 
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T(2.43))" exp(-2) x^ (1.43) using an 


m Generating a random sample from « 
exponential density function as ppposal, computing basic statistics from this 
sample and approximating E(X“) using Monte Carlo Integration method. 


The input is: 
IIDMC 
REJECT RJS ( (1/ (EXP (LGM (2.43) ) ) ) *EXP (-X) *X%1.43, E(0,2.43), 1.6338) 
/ SIZE=100000 NSAMPLE=1 RSEED=534652 


MCMC 


INTEG X^2;MC 
CSTATISTICS S1/ MAXIMUM MEAN MINIMUM SD VARIANCE N 


DENSITY S1 / HIST 


The output is: 
Classical Monte-Carlo Integration estimates for Sl 


Empirical Mean: 8.323 
Standard Error: 0.036 


i $1 
> parra 
N of Cases + 100000 
Minimum } 0.005 
Maximum i 15.603 
Arithmetic Mean 1 2.429 
Standard Deviation | 1.556 

y 2.422 


Variance 
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Count 
8 


m Generating random sample from 
a gamma density function as 


15 


o o 9 
a 8 S 
d uopodoig 


=> 
x 
jeg Ja 


and approximating E(X?) using Monte Carlo Integration method. 


The input is: 


IIDMC 
REJECT RIS ((1/ (EXP (LGM (2.43) ) ) ) *EXP(-X) *X*1.43, G(2, 1.2150) ,1 
/SIZE=100000 NSAMPLE=1 RSEED=236837468 


MCMC 
INTEG X*2 ; MC 


CSTATISTICS S1/ MAXIMUM MEAN MINIMUM SD VARIANCE N 


DENSITY S1 / HIST 


The output is: 


classical Monte-Carlo Integration estimates for Sì 


Empirical Mean: 8.339 
Standard Error: 0.036 


i sl 
ci op to nm ep rente 
N of Cases i 100000 
Minimum 1 0.008 
Maximum 1 15.376 
Arithmetic Mean E 2,430 


standard Deviation | 1.560 
Variance y 2.434 


(r(2.43)" exp(-2) 143) using 


proposal, computing basic statistics from this sample 


1102): 
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10000 io iii T— 7] 0.10 
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o 

2000 0.02 

1000 -0.01 

0 0.00 
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Random samples from the density function (r(2.43)" exp(=2) x(1.43) 


are generated using gamma, exponential and uniform distributions as proposals. But, 
the probability of accepting a sample from target function, 1/M varies with different 


proposals. 

Proposals Acceptance Probability 
Uniform 0.21164021 
Exponential 0.61207002 

Gamma 0.90073861 


The probability of accepting a proposal sample as a target variate depends on how close 
the product of proposal and constant is to the target function. Observe this by plotting 


the target function and product of proposal and constant together. 
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The input is: 


BEGIN 

FPLOT Y= GDF(X,2.43,1); XMIN=0 XMAX=10 YMIN=0 YMAX=0.4 HEIGHT=2 
WIDTH=2 LOC=0,0 TITLE='UNIFORM PROPOSAL' 

FPLOT Y= 4.7250*UDF (X,0, 15); XMIN=0 XMAX=10 YMIN=0 YMAX=0.4 
COLOR=1 HEIGHT=2 WIDTH=2 LOC=0,0 

FPLOT Y= GDF(X,2.43,1); XMIN=0 XMAX=10 YMIN=0 YMAX=0 .6 HEIGHT=2 
WIDTH=2 LOC=3,0 TITLE=' EXPONENTIAL PROPOSAL ' 

FPLOT Y= 1.63382*EDF (X,0, 2 .43); XMIN=0 XMAX=10 YMIN=0 YMAX=0.6 
COLOR=1 HEIGHT=2 WIDTH=2 LOC=3,0 

FPLOT Y= GDF (X,2.43,1); XMIN=0 XMAX=10 YMIN=0 YMAX=0.4 HEIGHT=2 
WIDTH=2 LOC=6,0 TITLE='GAMMA PROPOSAL' 

FPLOT Y= 1.1102*GDF (X,2,1 .2150); XMIN=0 XMAX=10 YMIN=0 YMAX=0.4 
COLOR=1 HEIGHT=2 WIDTH=2 LOC=6,0 

END 


The output is: 


UNIFORM PROPOSAL EXPONENTIAL PROPOSAL GAMMA PROPOSAL 


k ER È ; 


>02| >030| 


oo! 
8 10 0 g 4 6 g 


x 
Figure i Figure ii Figure iii 
When the uniform density function (Figure i) is considered as proposal, most of the 
generated points are outside the accepted region. In Figure ii (exponential) and Figure 
iii (gamma) the means of both target and proposal functions are the same, but when the 
gamma density function is taken as proposal (Figure iii), the product of constant and 
proposal is closer to the target function; thus, a generated point from proposal is 
accepted as a sample from target function with high probability and hence simulated 


values converge to theoretical values (mean=2.43, variance =2.43, and El X?]=8.3349) 
quickly. di si 
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Example 6 
Estimating Mean and Variance of a Bounded Posterior Density Function 
using RWM-H Algorithm and IndM-H Algorithm 


Let the observations {1,1,1,1,1,1,2,2,2,3} be from the (discrete) logarithmic series 
distribution with density 


x 


PO =a x=1,2,3,.. and 0<0<l 
x(-log(1-0)) 


From a sample of size 10, the logarithmic series distribution with parameter @ leads to 
(unnormalized) likelihood function, 


15 
[-log(1-9)]" 
Let the prior be t(0)=60(1-0). Then the posterior up to a multiplicative constant is 


L(0) = 


0 (1-8) for0<09<1. 
[-log(l-9)]" 


This example extracted from Monahan (2001) illustrates the generation of random 
samples from the specified posterior using the Random Walk Metropolis-Hastings 
algorithm and the Independent Metropolis-Hastings algorithm. 


To generate a random sample using the RWM-H algorithm, the selected proposal is 
uniform(-0.1, 0.1), which is symmetric around zero with small steps. Since the target 
function is bounded between 0 and 1, the value generated by the initial distribution 
should lie between 0 and 1 and thus the initial distribution is chosen as uniform(0,1). 
For getting samples from the posterior and computing its basic statistics, the input is: 


MCMC 7 
MH MHRW ( (x^16* (1-X) ) / ((-LOG (1-X) ) 10),BD(0,1), U(-0.1,0.1), 
U(0.0,1.0)) / SIZE=100000, 

NSAMPLE=1 BURNIN=500 GAP=30 RSEED=237465 


CSTATISTICS S1/ MAXIMUM MEAN MINIMUM SD VARIANCE N 
DENSITY S1 / KERNEL 
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The output is: 
i sl 
Meno e Dion ride 
N of Cases | 100000 
Minimum | 0.062 
Maximum 1 0.966 
Arithmetic Mean | 0,528 
Standard Deviation | 0.136 
Variance y 0.019 
TEA hela} VIN ee 


| 


do 01 02 03 04 05 06 07 08 09 16 
S1 


The mean and variance from the simulated data are 0.528 and 0.019 respectively. 


m When IndM-H is used, the support of the proposal should contain the support of 
the target function; hence the selected proposal in this example is uniform(0, 1). For 
generating random samples from the posterior and getting its mean and variance, 
the input is: 

MCMC 


MH MHIND((X*16*(1-X))/((-LOG(1-X))%10), BD(0,1), U(0.0,1.0), 
U(0.0,1.0)) / SIZE=100000, 


NSAMPLE=1 BURNIN=500 GAP=30 RSEED=65736736 


CSTATISTICS S1/ MAXIMUM MEAN MINIMUM SD VARIANCE N 
DENSITY S1 / KERNEL 
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The output is: 


N of Cases | 100000 
Minimum i 0.066 
Maximum i 0.966 
Arithmetic Mean | 10,527 
Standard Deviation | 0.137 
Variance i 0.019 


5009 TT 


Count 
sg 8 
ha I 
AA 


0 
0.0 01 02 03 04 05 06 07 08 09 1C 
S1 


The mean and variance of the posterior from simulated data obtained by RWM-H 
algorithm and IndM-H algorithm are approximately 0.527 and 0.019 respectively. 


Example 7 
Generating Bivariate Normal Random Samples by Gibbs Sampling Method 


This example explains the generation of a random sample from a bivariate normal 
distribution by iteratively generating univariate normal random samples. 


The input is: 


MCMC 

SAVE GIBBSBVN 
X1=0.0 

X2=0.0 


GVAR X1,X2 
GIBBS (ZRN(0.98*X2, 0.1990), ZRN(0.98*X1, 0.1990) )/SIZE=10000 


NSAMPLE=1 BURNIN=500 GAP=50, RSEED=231 
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The sample generated by the Gibbs Sampling method can be visualized through 
different SYSTAT graphs. The input is: 

USE GIBBSBVN 

PLOT x21*x11 / BORDER=HIST 


The output is: 


pes ER RS al 
2 


By iteratively generating samples from full conditional distributions, samples from a 
multivariate distribution are generated. The scatter plot shows the bivariate density 
function of X; and X,. The histograms on the border of the scatter plot are the 
univariate marginal distributions of X, and X3. 

Estimates of various parameters associated with the marginal distributions of X; 
and X; are obtained as descriptive statistics from the generated sample and also by 


Rao-Blackwellization as follows. 


The input is: 


LET RBEX11=0.98*X21 

LET RBEX21=0.98*X11 

CSTATISTICS/ MAXIMUM MEAN MEDIAN MINIMUM SD VARIANCE N, 
PTILE=2.5 50 97.5 
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The output is: 


i X11 X21 RBEX11 RBEX21 
icone A e ee 
N of Cases { 10000 10000 10000 10000 
Minimum | -3.680 -3.615 -3.542 -3.607 
Maximum I “3.985 IAS 3.652 3.906 
Median {0.014 0.008 0.007 0.014 


Arithmetic Mean 
standard Deviation | 


i 

| 

Variance i 

Method = CLEVELAND | 
2.500% | -1.931 -1.937 -1.899 -1.893 
50.000% 1 0.014 0.008 0.007 0.014 
97.500% T 31929 1.918 1.880 1.890 


It can be noticed that the Rao-Blackwellized estimates close to the true values and also 
that their variances are smaller than those of the naïve estimates. 


Example 8 
Gene Frequency Estimation 


Rao (1973) illustrated maximum likelihood estimation of gene frequencies of O, A, 
and B blood groups through the method of scoring. McLachlan and Krishnan (1997) 
used the EM algorithm for the same problem. This example illustrates Bayesian 
estimation of these gene frequencies by the Gibbs Sampling method. Consider the 
following multinomial model with four cell frequencies and their probabilities with 
parameters p, q, r with p+g+r=l. Letn=n,+ Nat Ng+NAB: 


Data Model 
no=176 É 
na=182 p?+2pr 
ng=60 q+2gr 
nap=!7 2pq 


Let us consider a hypothetical augmented data for this problem to be no; NAA, DAO,NBB» 


nap with a multinomial model (n; (1-p-)>, p?, 2p(1-P-4)s q”, 2q(1-p-q), 2pa}. 


np y p . ` 
a ngg could be considered as missing data. 


With respect to the latter full model, ny A > 


MODEL: Lei o 
x ~ Multinomialg (435; (1-p-4)% P , 2p(1-p-q), q^, 2q(1-p-9), 2pq) 


Prior information: 


(p, 4.0) ~ Dirichlet (o, B, Y) 
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The full conditional densities take the form: 


2 
ptt P 
n,, ~ Binomial ni-s = T 
g | 5 s'e 
q 
ña ~ Binomial Rs, 3] 
ci | à an 


p-(1-q)Beta(2n,, +n,0+Ngt0, 2h00tNuo +n50+7) 
q~ (1- p)Beta(2n,, +50 FM yn +Ê, oo tN4o +o +7) 


For generating random samples from p and q, the generated value from the beta 
distribution is to be multiplied with (1-9) and (1-p) respectively. Since it is not possible 
in our system to implement this, let us consider 


p~ Beta(2n ,¿+M 0 +N ¡y +0, 290 + Myo tgo +Y) 
q~ Beta(2n,; +My +N yg +B, 200 FM po + Ngo + Y) 


and whenever p and q appear in other full conditionals, p is replaced by (1-q)p and q is 
replaced by (1-p)q. By taking 0=2, B=2, and y=2, the input is: 


FORMAT 10, 5 
MCMC 

NAA=40 

NBB=5 

P=0.1 

Q=0.5 

N1=182 

N2=60 

GVAR NAA,NBB,P,Q 
FUNCTION FC1() 


P1=(((1-0)*P)%2)/((((1-0)*P)*%2)+(2* ( (1-Q) *P - -Q) *P) - - 
sg + Q)*P)*(1-((1-0)*P)-((1 
NAA=NRN (N1, P1) 


FUNCTION FC2() 


P2=(((1-P)*Q)*2)/((((1-P)*Q)*2 5 + (1- *0) - s 
SH Q) 2)+(2*((1-P)*Q)*(1-((1-P)*Q)-((1 


NBB= NRN (N2, P2) 
FUNCTION FC3() 


B1=NAA+182+17+1 


53 


Monte Carlo 


B2= (2*176)+182+60-NAA-NBB+1 
P=BRN (B1,B2) 


) 
FUNCTION FC4 () 


{ 

D1=NBB+60+17+1 
D2=(2*176)+182+60-NAA-NBB+1 
Q= BRN(D1,D2) 


SAVE GIBBSGENETIC 
GIBBS (FC1() , FC2() ,FC3(),FC4()) / SIZE=10000 NSAMPLE=1 
BURNIN=1000 GAP=1 RSEED=1783 


USE GIBBSGENETIC 

LET PP=(1-Q1)*P1 

LET QQ=(1-P1)*Q1 

LET RR=1-PP-Q0 

LET RBEP= (1- 

QQ) * ( (NAA1+182+17+2) / ( (NAA1+182+17+2) + ((2*176) +182+60-NAA1- 
NBB1+2))) 

LET RBEQ=(1- 

PP) * ((NBB1+60+17+2) / ( (NBB1+60+17+2) + ((2*176) +182+60-NAA1- 
NBB1+2))) 

LET RBER=1-RBEP-RBEQ 

CSTATISTICS PP QQ RR RBEP RBEQ RBER/ MAXIMUM MEAN, MEDIAN MINIMUM 
SD VARIANCE N PTILE=2.5 50 97.5 


BEGIN 
DENSITY PP RBEP/HIST XMIN=0.20 XMAX=0.35 LOC=0,0 


DENSITY QQ RBEQ/HIST XMIN=0 .05 XMAX=0.13 LOC=0, -3 
DENSITY RR RBER/HIST XMIN=0.60 XMAX=0.75 LOC=0,-6 


END 
FORMAT 
The output is: 
i PP QQ RR RBEP RBEO 
MERI ds Sy tes PE A ad 
N of Cases H 10000 10000 10000 10000 10000 
Minimum |! 0.19704 0.06190 0.60542 0.24135 0.08475 
Maximum | 031144 0.12549 0.70509 0.28952 0.10733 
Median | 0.25359 0.08980 0.65584 0.26456 0 09552 
Arithmetic Mean | 0.25407 0.09003 0.65589 0.26470 0.09564 
Arithmetlo raeson | 0.01622 11 0-099411 R0 OSES 00700 0.00296 
Variance { 0.00026 0.00009 0.00022 0.00005 0.00001 
Method = CLEVELAND | 
2.50000% | 0.22335, 0.07230 0.62616 0.25075 0 09009 
50.00000% H 0.25359 0.08980 0.65584 0.26456 0.09552 
97.50000% | 0.28793 0.10883 0.68477 0.27891 0 10165 


ooo 


3 10000 
' 


i 93 
i 619 
H 965 
lean 10. 66 
Standard Deviation | 0.00732 
hod = CLEVELAND | 
2.50000% | 0.62487 
50.00000% | 0.63965 
97.50000% | 0.65386 
1000, 0.10 2000, 02 
900) 
800] 008 
Pei 1500) 
A bf 
E jo Li 
400) 0.0 E E 
so g 
200 | ooz i l 
to 
di hh | | 
O E hh 
PP REEP 


i Di: 


805608 007 008 00 0 
‘REED 
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Maximum likelihood estimates of p, q and r evaluated by the scoring method or the 
EM algorithm are 0.26444, 0.09317 and 0.64239. With the available prior information, 
the estimates of p, q and r are approximated by the Gibbs Sampling method. The 
empirical estimates of p, q and r are 0.25407, 0.09003 and 0.65589 respectively. Rao- 
Blackwellized estimates are 0.26470, 0.09564 and 0.63966 respectively. 


Example 9 
Fitting Poisson Gamma Hierarchical Model 


This example concerns the finding of failure rates of 10 pumps in a nuclear power 
plant. The data set consists of the number of failures and the times of operation for 10 
pump systems at the nuclear power plant. The data set is from Gaver and 


O’ Muircheartaigh (1987). 


MODEL: 
Assume that the number of failures F ~ Poisson (Ajt;), where À; is failure rate and t; 


is the time of the operation for each pump. The prior densities are 
A, ~ gamma(a, B)and B ~ gamma(y, ô) 


with 0=1.8, y=0.01 and è=1. The full conditional densities take the form 


A | But, F ~ Gamma(F +08 (1, + BY”) 


COMINO 
B|A:4,Ao ~ Gamma oc (5+Y4) 


For getting random sample from the full conditionals and basic statistics, the input is 


FORMAT 10, 5 
USE PUMPFAILURES 
MCMC 
LAM1=0. 
LAM2=0. 
LAM3=0. 
LAM4=0. 
LAM5=0. 
LAM6=0. 
LAM7=0. 
LAM8=0. 


ANNANN 


56 


Monte Carlo 


LAM9=0.5 
LAM10=0.5 
BETA=1.0 


GVAR LAM1, LAM2, LAM3, 


LAM10, BETA 
FUNCTION FC1() 


Lab GEN a 
E FC2() 
LAM2=GRN (DATA (F, 2)+1.8 
FUNCTION FC3() 
LAM3=GRN (DATA (F,3)+1.8 
lenor FC4() 
SIRO: ATRE a 
FUNCTION FC5 () 
LAM5=GRN (DATA (F, 5) +1.8 
FUNCTION FC6 () 
A da 
FUNCTION FC7 () 
LAM7=GRN (DATA (F, 7) +1.8 
FUNCTION FC8 () 
LAM8=GRN (DATA (F, 8) +1.8 
FUNCTION FC9() 
LAM9=GRN (DATA (F, 9) +1.8 
FUNCTION FC10() 


LAM10=GRN (DATA (F,10)+1. 


, 


1 


Ù 


, 


, 


, 


, 


8 


LAM4, LAMS, LAM6, LAM7, LAM8, 


1/(DATA(T,1)+BETA)) 


1/(DATA(T,2)+BETA)) 


1/(DATA(T,3)+BETA)) 


1/(DATA(T,4)+BETA)) 


1/(DATA(T,5)+BETA)) 


1/(DATA(T,6)+BETA)) 


1/(DATA(T,7)+BETA)) 


1/(DATA(T,8)+BETA)) 


1/ (DATA (T, 9) +BETA) ) 


+ 1/ (DATA (T, 10) +BETA) ) 


LAM9, 
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FUNCTION FC11() 


BETA=GRN ((10*1.8)+0.01,1/(1.0+SUM(LAM1,LAM2,LAM3,LAM 
ra A E a ; + ; 4 , LAM5, LAM6, L 


SAVE GIBBSNUCLEARPUMPS 
GIBBS(FC1(),FC2(),FC3(),FC4(),FC5(),FC6(),FC7(),FC8(),FC9(),FC10( 
),FC11()) / SIZE=10000 NSAMPLE=1 BURNIN=500 GAP=30, | i 
RSEED=746572365 

USE GIBBSNUCLEARPUMPS 

CSTATISTICS / MAXIMUM MEAN, MEDIAN MINIMUM SD VARIANCE N PTILE=2.5 


50 97.5 
The output is: 
LAM11 LAM21 LAM31 LAM41 LAM51 
N of Cases 10000 10000 10000 10000 10000 


0.01045 0.00283 0.01681 0.03883 0.03731 


Minimum 
Maximum 0.20646 0.67833 0.31899 0.27406 2.15602 
Median 0.06688 0.13669 0.09991 0.12071 0.58231 


0.07024 0.15379 0.10456 0.12331 0.62866 
0.02682 0.09199 0.03971 0.03135 -0,29212 
0.00072 0.00846 0.00158 0.00098 0.08533 


Arithmetic Mean 
Standard Deviation 


' 
i 
+ 
i 
i 
i 
i 
' 
i 
i 
i 
i 
' 
i 
' 
i 
i 
' 
! 
' 
i 


Variance 
Method = CLEVELAND 
2.50000% 0.02811 0.02798 0.04097 0.06965 0.18876 
50.00000% 0.06688 0.13669 0.09991 0.12071 0.58231 
97.50000% 0.13234 0.38348 0.19548 0.19103 1.31557 
LAM61 LAM71 LAM81 LAM91 LAM101 
N of Cases 10000 10000 10000 10000 10000 
Minimum 0.24171 0.01499 0.02261 0.07778 0.72224 
Maximum Li 4.43899 4.42489 4.63179 3.52014 
Median 0.60204 0.71832 0.70962 1.20703 1.81756 


| 
+ 
t 
| 1.26120 
! 
! 
i 
' 


0.61282 0.84003 0.82617 1.30140 1.84679 
0.13603 0.54447 0.52776 0.58168 0.39485 
0.01850 0.29645 0.27853 0.33836 0.15591 


Arithmetic Mean 
Standard Deviation 


Variance 
Method = CLEVELAND 
2.50000% 0.37598 0.15513 0.14817 0.43988 1.15991 
50.00000% 0.60204 0.71832 0.70962 1.20703 1.81756 
97.50000% 0.90423 2.20541 2.13166 2.67503 2.69876 
cal + 
N of Cases t 10000 
Minimum 1 0.73295 
Maximum 1 6.42968 
Median 1 2.39187 
Arithmetic Mean | 2.46662 
Standard Deviation | 0.71612 
Variance | 0.51283 
Method = CLEVELAND 
2.50000% | 1.33448 
50.00000% | 2.39187 
97.500008 1 4.09004 
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Example 10 
Fitting Linear Regression using Gibbs Sampler 


This example taken from Congdon (2001) illustrates a Bayesian Linear Regression of 
December rainfall on November rainfall based on data for ten years. The data set is 
from Lee (1997), where Y is December rainfall and X is November rainfall. 


MODEL: Er 
Assume that Y; ~ Normal (0; , ©), where 0;= A+ Bl, — x) 
Priors: 
a ~ Normal (W,,0,) 


B ~ Normal (u,,0,) 
T=0 7? ~ Gamma (y,6") 


The full conditional densities take the form 


| 


(n/o')+(1/07) > (n/o')+(1/0º) 


a| B,o” ~ Normal 


B|a,0* ~ Normal (4/07)+(X,@-»)/0°) j l 


-2 n lan 
o” |a, ~ Gamma ath l ; +5 
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By taking prior distribution parameters as H1=0, H=0, 6;=10000 , œ=1000, y =0.001 
and 80.001, getting random samples from the full conditionals and computing basic 
Statistics, the input is: 


MCMC 
USE RAINFALL 
ALPHA=29 

BETA=0.5 

TAU=0.5 

GVAR ALPHA, BETA, TAU 
io co CF1() 


ALPHA=ZRN(((0/(10000))+((CSUM(Y))*(TAU)))/((10*(TAU 
SOR (1/( (10* (TAU) ) +(1/10000) ))) 44 REAR A 10000)), 


FUNCTION CF2() 


BETA=ZRN(((0/(1000))+((CSUM(Y*(X-CMEAN(X))))*TAU))/(((CSUM((x- 
CMEAN (X) ) ^2) ) *TAU) +(1/1000) ) ,SQR(1/(((CSUM((x- 
CMEAN (X) ) ^2) ) *TAU)+(1/1000)))) 


[oe CF3 () 


TAU=GRN ((10/2)+0.001, 1/(( (1/2) * (CSUM( (Y-ALPHA- (BETA* (X- 
prane n 

SAVE GIBBSYORKRAIN 

GIBBS ( CF1(),CF2(),CF3()) /SIZE=10000 NSAMPLE=1 BURNIN=1000 GAP=1 


RSEED=53478 

USE GIBBSYORKRAIN 

LET SIGSQ=1/TAU1 

CSTATISTICS ALPHA1 BETA1 SIGSQ/MAXIMUM MEAN, MEDIAN MINIMUM SD 


VARIANCE N PTILE=2.5 50 97.5 


The output is: 
ALPHAL BETAL SIGSQ 
N of Cases O DOS 10000 10000 10000 
Minimum 17.23346 -0.86588 42.89717 
e 63.37492  0.63053  2529.65534 
40.57834 -0.16655 209.43770 


Arithmetic Mean 


Standard Deviation 


+ 
1 
Median ! 3 la a 
i 40.56741 -0.16385 254.30380 
H . 0. 172. 
| 25.04786 0.01958 2.98421E+004 
i 
i 


Variance 

Method = CLEVELAND 
2.50000% 30.43290 -0.44318 88.06745 
50.00000% 40.57834 -0.16655 209.43770 
97.50000% 50.49352 0.11603 723.58007 


The estimates of O, B and & from simulated values are respectively 40.56741, 
-0.16385 and 254.30380. Using Time Series plots, we can study the behavior of the 


simulated samples. 
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The input is: 


SERIES 
TPLOT ALPHA1 


Series Plot 


= 


iù ale: 1 fr ee ES ee | 
2000 4000 6000 8000 10000 12000 
Case 
TPLOT BETA1 
Series Plot 
1.0 Ror: T T Ei) T T 
05 4 
x 
H 00 = 
Lu 
a 
05 BI 
Agi a E, dy 
0 2000 4000 6000 8000 10000 12000 


Case 
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TPLOT SIGSQ 
Series Plot 
3000 — DE = = æl 
9 2000 
D 
1000 
0 
0 2000 4000 6000 8000 10000 12000 
Case 


By posterior predictive simulation, the December rainfall can be predicted based on the 
new November rainfall 46.1. 


The input is: 


LET THETANEW= ALPHA1+BETA1* (46.1-57.8) 


LET YNEW= ZRN(THETANEW, SOR(SIGSO)) 
CSTATISTICS YNEW/MAXIMUM MEAN MEDIAN MINIMUM SD VARIANCE N 


PTILE=2.5 50 97.5 


The output is: 


i YNEW 
ca PI tS 
N of Cases i 10000 
Minimum 1 -49.074 
Maximum | 137.631 
Median t 42.547 
Arithmetic Mean | 42.467 
Standard Deviation | 16.871 
Variance 1 284.639 
Method = CLEVELAND | 

2.500% i 8.341 
50.000% 1 442.547. 
97.500% 1 76.677 


The prediction of December rainfall is 42.567 with standard deviation 16.871. 
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Computation 


Algorithms 


Algorithms used here for random sampling from specified distributions may be found 
in Devroye (1986), Bratley et al. (1987), Chhikara and Folks (1989), Fishman (1996), 
Gentle (1998), Evans et. al. (2000), Karian and Dudewicz (2000), Ross (2002), and 
Hörmann et al. (2004). For some distributions, the inverse CDF method (analytical or 
numerical) is used whereas for others special methods are used. The Adaptive 
Rejection Sampling method uses the algorithm developed by Gilks (1992), Gilks and 
Wild (1993), and Robert and Casella (2004). 
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RANDSAMP: 
Random Sampling 


RANDSAMP module generates random samples from a specified distribution with 
specified parameters. The distribution that is specified may be one of 42 UNIVARIATE 
discrete, UNIVARIATE continuous, and MULTIVARIATE distributions. The random 
samples drawn can be used for the desired Monte Carlo integration computations using 
INTEG Classical Monte Carlo Integration and Importance Sampling methods are not 
Random Sampling procedures, but they estimate (intractable) integrals through 
empirical sampling (refer INTEG command in MCMC). 


Setup: 
Univariate discrete and continuous Multivariate 
* RANDSAMP * RANDSAMP 
SAVE SAVE 
HOT * UNIVARIATE HOT* MULTIVARIATE 
Note: GENERATE command used in the previous version is no more required. 
UNIVARIATE command 


UNIVARIATE distribution notation(parameter list) 


Specifies a distribution with corresponding parameter values as arguments. 
f the random sample to be generated, The default is 1. 


| SIZE=n1 size 0 
mns) each of a specified size to be generated. 


NSAMPLE=n2 number of samples (colu 
The default is 1. 


RSEED=n3 random seed. 


butions along with distribution notation is given in section 


Note: The list of distri 
Random Sampling of Monte Carlo. 


Distribution Notations used in 
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Examples: 
For the normal distribution with parameters location = 0 and scale =1.2 
UNIVARIATE ZRN(0,1.2) 


For the binomial distribution with n=16 and p=0.5 


UNIVARIATE NRN(16, 0.5) 


MULTIVARIATE command 


MULTIVARIATE distribution notation(parameter list) 
Specifies a distribution with corresponding parameter values as arguments. 


I SIZE=n1 size of the random sample to be generated. The default is 1. 


NSAMPLE=n2 number of samples (columns) each of a specified size to be generated. 
The default is 1. 


RSEED=n3 random seed. 


Examples: 


For bivariate exponential distribution with parameters (1,1,0.5) 


MULTIVARIATE BERN(1,1,0.5) 


Note: The list of distributions along with distribution notation is given in section 
Distribution Notations used in Random Sampling of Monte Carlo. 


SAVE command 


SAVE filename 


Saves generated samples to a file specified with name filename, 
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IIDMC: 
IID Monte Carlo 


Setup: 


REJECT command 


IIDMC consists of two random sampling procedures: Rejection Sampling and Adaptive 
Rejection Sampling. The random samples drawn can be used for the desired Monte 
Carlo integration computations. The Classical Monte Carlo Integration and Importance 
Sampling methods are not random sampling procedures, but they estimate (intractable) 
integrals through empirical sampling (refer INTEG command in MCMO). 


* HDMC 


Note: GENERATE command used in the previous version is no more required. RJS and 
ARS commands are functions under REJECT command. PROPOSAL command is also 
not required, since it is the second argument of RJS function. 


Specifies rejection sampling (RJS) or adaptive rejection sampling (ARS) methods to be 


used for random sample generation. 


REJECT RJS(targetexpression, proposalexpression, constant) 


Specifies a target function in expression form for which random samples are needed. 
‘constant’ is an upper bound to supremum of ratio of target and proposal functions and 


should be a positive real number. 


tributions along with distribution notation is given in section 


Note: The list of dis 
d MCMC of Monte Carlo. 


Distribution Notations used in IIDMC an 


or 
REJECT ARS(targetexpression, rangeexpression) 


i ion i i i les are needed 
Specifies a target function in expression form for which random samp € 
and support of | target distribution with starting points. The starting points are to be in 


ascending order. 
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Range expressions for target expression in ARS function are listed in the following 


table: 

Range expression Description 

UB(a,b) Specifies the target as unbounded. The two points are starting 
points. 

LB(a,b) Specifies the target function as left bounded. The left point is a 


bound as well as starting point whereas the right point is a start- 
ing point only. 


RB(a,b) Specifies the target function as right bounded. The right point is 
a bound as well as starting point whereas the left point is a start- 
ing point only. 

BD(a,b) Specifies the support of target as bounded. The left and right 


starting points are also bounds. 


| SIZE=n1 size of the random sample to be generated. The default is 1. 
NSAMPLE=n2 number of samples (columns) each of a specified size to be generated. 
The default is 1. 
RSEED=n3 random seed. 
Example: 


For rejection sampling from target function "(x41.7)*((1-x)A5.3)" usi sal 
uniform(0, 1) and constant 0.0206 y*((1-x) )" using proposa 


REJECT RJS((x11.7)*((1-x)A5.3), 0.0206) 
For adaptive rejection sampling from unbounded target function "exp(-x2/2)" using - 
1 and 1 as starting values 


REJECT ARS( EXP(-x12/2), UB( -1, 1)) 


SAVE command 


SAVE filename 


Saves generated samples to a file specified by name filename. 
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MCMC: 
Markov Chain Monte Carlo 


Setup: 


MH command 


MCMC generates random samples from a target distribution by constructing an ergodic 
Markov chain. The M-H Algorithm and Gibbs Sampling method are two types of 
MCMC algorithms. Three types of M-H Algorithm: RWM-H, IndM-H and Hybrid 
RWInd M-H algorithms are provided. Fixed Scan Gibbs Sampling iteratively generates 
random samples from full conditionals. Monte Carlo Integration methods are not 
applicable to Gibbs Sampling procedure in SYSTAT. 


M-H Algorithm Gibbs Sampling 
* MCMC * MCMC 
SAVE USE 
HOT * MH SAVE 
HOT * INTEG * VARIABLE DECLARATIONS 
* GVAR 
FUNCTION 
HOT* GIBBS 


Note: GENERATE command used in the previous version is no more required. 
INITSAMP, PROPOSAL commands are arguments of various M-H algorithm functions 
under MH command. You can use VARIABLE DECLARATIONS, GVAR and FUNCTION 
commands to specify full conditionals in (FULLCOND command in the previous — 
version). GIBBS sampling algorithm. ESTIMATE command in Monte Carlo integration 


in the previous version is no more required. 


MH MHRW(target_exp, range_exP » proposal exp, initialvalue_exp) 
or 

MHIND(target exp, range_eXP; 
or 

MHHY(target exp, range_ex| 


proposal exp, initialvalue exp) 


pow, proposal. exp , ind. proposal exp, initialvalue. exp) 


orithms to be used for random sample generation. Function names 


Specifies various alg 
ss f M-H algorithms. 


indicate various types O 
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Function name Algorithm 

MHRW M-H random walk 

MHIND M-H independent 

MHHY M-H hybrid of MHRW and MHIND 
target_exp 


Specifies a target function in expression form from which random samples are needed. 


Range expression Description 


UB Specifies the target as unbounded. 

LB(a) Specifies the target function as left bounded at a. 

RB(b) Specifies the target function as right bounded at b. 

BD(a,b) Specifies the support of target as bounded between a and b. 


proposal exp or initialvalue exp 


The distributions for proposal or initial values can be chosen from Beta, Cauchy, Chi- 
square, Exponential, F, Gamma, Gompertz, Gumbel, Double Exponential (Laplace), 
Logistic, Logit normal, Lognormal, Normal, Pareto, Rayleigh, t, Triangular, Uniform, 
Inverse Gaussian (Wald) and Weibull distributions. 


Note: The list of distributions along with distribution notation is given in section 
Distribution Notations used in IIDMC and MCMC of Monte Carlo. 


/ SIZE=n1 size of the random sample to be generated. The default is 1. 
NSAMPLE = n2 number of samples (columns) each of a specified size to be generated. 
The default is 1. 
BURNIN = n3 size of random samples to be discarded initially from the chain. The 
default is 500. 
GAP = n4 


the difference between two successive samples that are extracted from the 
generated sample. The default is 30. 


RSEED = n5 random seed. 
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Examples: 


To generate random samples using M-H random walk algorithm from unbounded 
target function "exp(-(x/2)/2)" with Cauchy(0,1) as the proposal distribution and an 
initial value from uniform(-1,1) distribution: 

MH MHRW(exp(-(x"2)/2),UB,C(0,1),U(-1,1)) 

To generate random samples using M-H independent algorithm from bounded target 
function "((x+2)9125)*((1-x)438)*(x434)" with uniform(0,1) the proposal distribution 
and an initial value from uniform(-1,1) distribution: 


MH MHIND(((x+2)4125)*((1-x)438)*(x434),BD( 0, 1), U(0,1), U(0,1)) 


INTEG command 


INTEG expression 


Specifies an integrand function for Monte Carlo integration. 


; MC classical Monte Carlo integration. 
IMPSAMPI DENFUN = expression Importance Sampling integration method 
IMPSAMPR DENFUN = expressionImportance Sampling ratio method 


Expression after DENFUN is the numerator of weight function in Importance Sampling 
method. 


Examples: 


For classical Monte Carlo integration of integrand “x2” 


INTEG x2 ; MC 


For Importance Sampling integration of integrand “COS((x)/2)” with weight function 1. 


INTEG COS((X)/2) ; IMPSAMPI DENFUN=1 


Note: Monte Carlo Integration methods in SYSTAT can be used only after generating 
random samples from any one of the univariate discrete and univariate continuous 
random sampling methods, Rejection Sampling, Adaptive Rejection Sampling and 


M-H algorithms. 
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Function name Algorithm 

MHRW M-H random walk 

MHIND M-H independent 

MHHY M-H hybrid of MHRW and MHIND 
target_exp 


Specifies a target function in expression form from which random samples are needed. 


Range expression Description 
UB Specifies the target as unbounded. 


LB(a) Specifies the target function as left bounded at a. 
RB(b) Specifies the target function as right bounded at b. 
BD(a,b) Specifies the support of target as bounded between a and b. 


proposal exp or initialvalue. exp 


The distributions for proposal or initial values can be chosen from Beta, Cauchy, Chi- 
square, Exponential, F, Gamma, Gompertz, Gumbel, Double Exponential (Laplace), 
Logistic, Logit normal, Lognormal, Normal, Pareto, Rayleigh, t, Triangular, Uniform, 
Inverse Gaussian (Wald) and Weibull distributions. 


Note: The list of distributions along with distribution notation is given in section 
Distribution Notations used in IDMC and MCMC of Monte Carlo. 


/ SIZE=n1 size of the random sample to be generated. The default is 1. 

NSAMPLE = n2 number of samples (columns) each of a specified size to be generated. 
The default is 1. 

BURNIN = n3 size of random samples to be discarded initially from the chain. The 
default is 500. 

GAP = n4 the difference between two successive samples that are extracted from the 
generated sample. The default is 30. 

RSEED = n5 


random seed. 
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Examples: 


To generate random samples using M-H random walk algorithm from unbounded 
target function "exp(-(x*2)/2)" with Cauchy(0,1) as the proposal distribution and an 
initial value from uniform(-1,1) distribution: 

MH MHRW(exp(-(x*2)/2),UB,C(0,1),U(-1,1)) 

To generate random samples using M-H independent algorithm from bounded target 
function "((x+2)41 25)*((1-x)438)*(x434)" with uniform(0,1) the proposal distribution 
and an initial value from uniform(-1,1) distribution: 

MH MHIND(((x+2)4125)*((1 -x)^38)* (x34), BD( 0, 1), U(0,1), U(0,1)) 


INTEG command 


INTEG expression 
Specifies an integrand function for Monte Carlo integration. 


; MC classical Monte Carlo integration. 
IMPSAMPI DENFUN = expression Importance Sampling integration method 
IMPSAMPR DENFUN = expressionImportance Sampling ratio method 


Expression after DENFUN is the numerator of weight function in Importance Sampling 
method. 


Examples: 


For classical Monte Carlo integration of integrand 02” 


INTEG x42 ; MC 


For Importance Sampling integration of integrand “COS((x)/2)” with weight function 1. 


INTEG COS((X)/2) ; IMPSAMPI DENFUN=1 

Note: Monte Carlo Integration methods in SYSTAT can be used only after generating 
random samples from any one of the univariate discrete and univariate continuous 
random sampling methods, Rejection Sampling, Adaptive Rejection Sampling and 


M-H algorithms. 
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GVARIABLE command 


GVARIABLE varlist 


Specifies variables used for Gibbs sampling algorithm. Output will be given for 
variables in varlist specified after GVARIABLE. 


Examples: 


GVARIABLE a, b 


VARIABLE DECLARATIONS 


Specifies initial values for Gibbs variables declared using GVARIABLE command. 
Examples: 
For setting initial values for Gibbs variables a = 0 and b = 0 


a=0 
b=0 


FUNCTION command 


Function specifies full conditional distributions in Gibbs sampling algorithm. Each 


univariate full conditional distribution for all Gibbs variables must be defined using the 
unique function. 


FUNCTION function name() 
{ 
List of statements 


} 


Function name must start with a letter. List of statements involve simple assignment(s) 
to temporary variables and/or Gibbs variables. 
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Example: 
FUNCTION FC10) 
{ 

mu=0 

sigma=1 

a = zm(mu, sigma) 


GIBBS command 


GIBBS( fname_1(),.- . fname k()) 


Specifies the list of all full conditional distributions specified using FUNCTION 
command. Alternatively, you can specify all your expressions as arguments of GIBBS 


(see example listed below). SYSTAT assumes one-one correspondence between 


varlist in GVARIABLE command and GIBBS arguments. 


/ SIZE =n1 size of the random sample to be generated. The default is 1. 
NSAMPLE = n2 number of samples (columns) each of a specified size to be generated. 
The default is 1. 


BURNIN = n3 size of random samples to be discarded initially from the chain. The 
default is 500. 

GAP = n4 the difference between two successive samples that are extracted from the 
generated sample. The default is 30. 

RSEED = n5 random seed. 


Note: The list of distributions along with distribution notation is given in section 
Distribution Notations used in IDMC and MCMC of Monte Carlo. 
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Example: 


For generation of a random sample from a bivariate normal distribution with mean 
vector [0 , 0] and correlation 0.98 using Gibbs sampling. 


MCMC 

X1=0.0 
X2=0.0 
sigma=0.1990 
GVAR X1, X2 


FUNCTION FC1() 
X1= ZRN(0.98*X2, sigma ) 
FUNCTION FC1() 
X1= ZRN(0.98*X2, sigma) 


GIBBS (FC1(),FC2()) /SIZE=10000 NSAMPE=1 BURNIN=500 GAP=50 
RSEED=231 


In the above example, the full conditionals are simple to express as functions, so we 
can give them directly as arguments of GIBBS: 


USE command 


MCMC 

X1=0.0 

X2=0.0 

GVAR X1,X2 

GIBBS ( ZRN(0.98*X2, 0.1990), ZRN(0.98*X1, 


0.1990) /STZE=10000 NSAMPE=1 BURNIN=500 GAP=50, 
RSEED=231 


(HOT) USE filename 


Reads a data file named filename. You do not have to enclose the fi 


lename and path in 


quotation marks unless the path or name contains spaces. In the absence of a 
designated path, the software searches for the file in the directories defined by FPATH 
for input data files (USE), temporary data files (WORK), and output data files (SAVE). 


The date and time of a file's creation appear in the output when that fi 


le is used. 
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/ NAMES suppresses the date and time information, displaying only the names 
of the variables in the data file. 
NONAMES neither the variable names nor the file's date and time are displayed. 
COMMENT displays the file comments after the variable names. 
DICTIONARY displays the file comments, variable names, and variable comments. 
MATRIX or MAT reads the file as a matrix with specified name. 
= matixname 
MTYPE= NUMERIC reads all numeric or string variable(s) as a matrix. By default 
or SYSTAT reads only numeric columns. 
STRING 
ROWNAME uses var (or var$) to name the rows of matrix. 
=var or var$ 
COLNAME uses var (or var$) to name the columns of matrix. 
=var or var$ 
SAVE command 


SAVE filename 


Saves generated samples to a file specified by name filename. 


Expressions in Monte Carlo 


the target functions from which random sample is 
generated are expressions which involve mathematical functions of a single variable. 
The integrand in Monte Carlo Integration and the density function in Importance 
Sampling procedures are expressions. In the Gibbs Sampling method, the parameters 


of full conditionals are expressions, which may involve variables from a data file and 
mathematical functions. 


For IIDMC and M-H algorithms, 


A 
adaptive rejection sampling 4 
ARS 
in REJECT 67 
B 
burn-in 12 
D 
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Benford’s law 14, 17 
Beta 15, 17, 19, 24, 25, 28, 29, 31 
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Exponential 15, 17, 20, 24, 25, 28, 
29,31 
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